LCI-INSA Linguistic Experiment for CLEF-IP Classification Track
Abstract
We present the experiment that the LCI group performed to prepare its submission to the CLEF-IP Classification Track. In this preliminary experiment, we used part of the available target documents as a test set and the rest as a training set. We describe AGFL, the system used to extract linguistic triples, and LCS, the system used for classification with the Winnow algorithm. We show that using linguistic triples in place of bags of words improves accuracy, as does using the names and addresses of the applicants. We found that using the complete descriptions as bags of words does not perform noticeably better than using only abstracts and titles. Some simple mathematics shows that the official measures are redundant: R@N should be used to evaluate a ranking, P@1 to evaluate routing, and the usual precision, recall and F1 should be used on the results of a real classification, that is, a selection of classes performed internally by the classifier.
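The abstract names the measures without defining them. As a minimal sketch, assuming the standard textbook definitions (the abstract itself gives no formulas), the three kinds of evaluation could be computed as follows. The function names and the example class codes are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set, Tuple


def precision_at_1(ranking: Sequence[str], relevant: Set[str]) -> float:
    """P@1 (routing): 1.0 if the top-ranked class is correct, else 0.0."""
    return 1.0 if ranking and ranking[0] in relevant else 0.0


def recall_at_n(ranking: Sequence[str], relevant: Set[str], n: int) -> float:
    """R@N (ranking): fraction of the correct classes found in the top n.

    Assumes the ranking contains no duplicate classes.
    """
    if not relevant:
        return 0.0
    hits = sum(1 for cls in ranking[:n] if cls in relevant)
    return hits / len(relevant)


def precision_recall_f1(selected: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for a set of classes selected by the classifier."""
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In this sketch, P@1 and R@N operate on an ordered list of classes, while precision, recall and F1 operate on an unordered set whose size is decided by the classifier itself, which mirrors the distinction the abstract draws between ranking, routing and real classification.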
Similar resources
CLEF-IP 2009: Retrieval Experiments in the Intellectual Property Domain
The CLEF-IP track ran for the first time within CLEF 2009. The purpose of the track was twofold: to encourage and facilitate research in the area of patent retrieval by providing a large clean data set for experimentation; to create a large test collection of patents in the three main European languages for the evaluation of cross-lingual information access. The track focused on the task of prior...
CLEF-IP 2011: Retrieval in the Intellectual Property Domain
The patent system is designed to encourage disclosure of new technologies and novel ideas by granting exclusive rights on the use of inventions to their inventors, for a limited period of time. Before a patent can be granted, patent offices around the world perform thorough searches to ensure that no previous similar disclosures were made. In the intellectual property terminology, such kind of se...
Automatic Prior Art Searching and Patent Encoding at CLEF-IP '10
In the intellectual property field two tasks are of high relevance: prior art searching and patent classification. Prior art search is fundamental for many strategic issues such as patent granting, freedom to operate and opposition. Accurate classification of patent documents according to the IPC code system is vital for the interoperability between different patent offices and for the prior ar...
Experiments with Citation Mining and Key-Term Extraction for Prior Art Search
This technical note presents the system built for the IP track of CLEF 2010 based on PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS), the modular search infrastructure initially realized for CLEF IP 2009. We largely reused the system of the previous CLEF IP but at a relatively smaller scale and with the improvement of three main components: • A new citation mining tool based on C...
Attempts to Search Czech Spontaneous Spoken Interviews - the University of West Bohemia at CLEF 2007 CL-SR track
The paper presents an overview of the system built and the experiments performed for the CLEF 2007 CL-SR track by the University of West Bohemia. We have concentrated on the monolingual experiments using the Czech collection only. The approach that was successfully employed by our team in the last year's campaign (simple tf.idf model with blind relevance feedback, accompanied with solid linguistic ...